Overview

Dataset statistics

Number of variables: 25
Number of observations: 37709
Missing cells: 334573
Missing cells (%): 35.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 28.6 MiB
Average record size in memory: 795.9 B
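
The missing-cell percentage can be checked from the counts above. This is a minimal sketch of the arithmetic, assuming the percentage is taken over every cell (variables × observations); the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 25
n_observations = 37709
missing_cells = 334573

# Assumption: the denominator is the full table, variables x observations.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```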

Variable types

Categorical: 10
Numeric: 12
Unsupported: 3

Alerts

ClaseVehiculo__c has constant value "99999" [Constant]
TipoVehiculo__c has constant value "99999" [Constant]
n_prod_prev is highly correlated with total_siniestros and 2 other fields [High correlation]
total_siniestros is highly correlated with n_prod_prev and 2 other fields [High correlation]
total_pagado_smmlv is highly correlated with n_prod_prev and 3 other fields [High correlation]
anios_ultimo_siniestro is highly correlated with n_prod_prev and 2 other fields [High correlation]
Activos__c is highly correlated with AnnualRevenue [High correlation]
AnnualRevenue is highly correlated with Activos__c and 1 other field [High correlation]
MontoAnual__c is highly correlated with total_pagado_smmlv [High correlation]
EgresosAnuales__c is highly correlated with AnnualRevenue [High correlation]
total_siniestros is highly correlated with total_pagado_smmlv and 1 other field [High correlation]
total_pagado_smmlv is highly correlated with total_siniestros and 1 other field [High correlation]
anios_ultimo_siniestro is highly correlated with total_siniestros and 1 other field [High correlation]
AnnualRevenue is highly correlated with EgresosAnuales__c [High correlation]
EgresosAnuales__c is highly correlated with AnnualRevenue [High correlation]
n_prod_prev is highly correlated with total_siniestros [High correlation]
total_siniestros is highly correlated with n_prod_prev and 2 other fields [High correlation]
total_pagado_smmlv is highly correlated with total_siniestros and 2 other fields [High correlation]
anios_ultimo_siniestro is highly correlated with total_siniestros and 1 other field [High correlation]
AnnualRevenue is highly correlated with EgresosAnuales__c [High correlation]
MontoAnual__c is highly correlated with total_pagado_smmlv [High correlation]
EgresosAnuales__c is highly correlated with AnnualRevenue [High correlation]
churn is highly correlated with TipoVehiculo__c and 2 other fields [High correlation]
TipoVehiculo__c is highly correlated with churn and 8 other fields [High correlation]
Genero__pc is highly correlated with TipoVehiculo__c and 3 other fields [High correlation]
ClaseVehiculo__c is highly correlated with churn and 8 other fields [High correlation]
FechaInicioVigencia__ctrim is highly correlated with churn and 2 other fields [High correlation]
tipo_prod_desc is highly correlated with TipoVehiculo__c and 2 other fields [High correlation]
tipo_ramo_name is highly correlated with TipoVehiculo__c and 2 other fields [High correlation]
EstadoCivil__pc is highly correlated with TipoVehiculo__c and 3 other fields [High correlation]
ciudad_name is highly correlated with TipoVehiculo__c and 2 other fields [High correlation]
CodigoTipoAsegurado__c is highly correlated with TipoVehiculo__c and 4 other fields [High correlation]
PuntoVenta__c is highly correlated with n_prod_prev and 3 other fields [High correlation]
tipo_ramo_name is highly correlated with tipo_prod_desc and 1 other field [High correlation]
tipo_prod_desc is highly correlated with tipo_ramo_name and 1 other field [High correlation]
NumeroPoliza__c is highly correlated with tipo_ramo_name and 1 other field [High correlation]
FechaInicioVigencia__ctrim is highly correlated with churn [High correlation]
churn is highly correlated with FechaInicioVigencia__ctrim [High correlation]
n_prod_prev is highly correlated with PuntoVenta__c and 2 other fields [High correlation]
total_siniestros is highly correlated with PuntoVenta__c and 2 other fields [High correlation]
total_pagado_smmlv is highly correlated with PuntoVenta__c and 3 other fields [High correlation]
anios_ultimo_siniestro is highly correlated with PuntoVenta__c and 2 other fields [High correlation]
AnnualRevenue is highly correlated with OtrosIngresos__c and 1 other field [High correlation]
OtrosIngresos__c is highly correlated with anios_ultimo_siniestro and 1 other field [High correlation]
EgresosAnuales__c is highly correlated with AnnualRevenue and 1 other field [High correlation]
EstadoCivil__pc is highly correlated with Genero__pc [High correlation]
Genero__pc is highly correlated with EgresosAnuales__c and 1 other field [High correlation]
MarcaVehiculo__c has 37709 (100.0%) missing values [Missing]
MdeloVehiculo__c has 37709 (100.0%) missing values [Missing]
n_prod_prev has 1218 (3.2%) missing values [Missing]
total_siniestros has 21451 (56.9%) missing values [Missing]
total_pagado_smmlv has 21451 (56.9%) missing values [Missing]
anios_ultimo_siniestro has 21451 (56.9%) missing values [Missing]
Activos__c has 14609 (38.7%) missing values [Missing]
AnnualRevenue has 14609 (38.7%) missing values [Missing]
MontoAnual__c has 37682 (99.9%) missing values [Missing]
OtrosIngresos__c has 16905 (44.8%) missing values [Missing]
Profesion__pc has 37709 (100.0%) missing values [Missing]
EgresosAnuales__c has 14609 (38.7%) missing values [Missing]
EstadoCivil__pc has 14347 (38.0%) missing values [Missing]
Genero__pc has 14347 (38.0%) missing values [Missing]
ciudad_name has 14347 (38.0%) missing values [Missing]
edad has 14420 (38.2%) missing values [Missing]
Activos__c is highly skewed (γ1 = 103.1264111) [Skewed]
AnnualRevenue is highly skewed (γ1 = 23.01222549) [Skewed]
OtrosIngresos__c is highly skewed (γ1 = 81.18737292) [Skewed]
MarcaVehiculo__c is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
MdeloVehiculo__c is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Profesion__pc is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
total_pagado_smmlv has 1856 (4.9%) zeros [Zeros]
OtrosIngresos__c has 19756 (52.4%) zeros [Zeros]
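
Columns flagged Constant or (almost) entirely Missing carry little or no information for modelling. The sketch below lists such columns from the counts reported in the alerts above. The 99% cut-off and the helper name `columns_to_review` are assumptions made for illustration, not something the report prescribes.

```python
N_ROWS = 37709

# Missing counts copied from the alerts above (a subset of the columns).
missing_counts = {
    "MarcaVehiculo__c": 37709,
    "MdeloVehiculo__c": 37709,
    "Profesion__pc": 37709,
    "MontoAnual__c": 37682,
    "total_siniestros": 21451,
    "n_prod_prev": 1218,
}
# Columns the alerts flag as Constant.
constant_columns = ["ClaseVehiculo__c", "TipoVehiculo__c"]


def columns_to_review(missing, constants, n_rows, threshold=0.99):
    """Return columns that are constant or missing in at least `threshold` of rows.

    `threshold` is a hypothetical cut-off chosen for this example.
    """
    mostly_missing = [col for col, n in missing.items() if n / n_rows >= threshold]
    return sorted(set(constants) | set(mostly_missing))


candidates = columns_to_review(missing_counts, constant_columns, N_ROWS)
print(candidates)
```

Whether to drop these columns or repair them upstream depends on why they are empty, which the report does not say.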

Reproduction

Analysis started: 2022-06-08 19:27:35.425034
Analysis finished: 2022-06-08 19:33:27.220190
Duration: 5 minutes and 51.8 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
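
The reported duration follows from the two timestamps. A small check using only the standard library (the format string is an assumption matching how the timestamps are printed here):

```python
from datetime import datetime

# Format assumed from the way the report prints its timestamps.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-06-08 19:27:35.425034", FMT)
finished = datetime.strptime("2022-06-08 19:33:27.220190", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.1f} seconds")
```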

Variables

CodigoTipoAsegurado__c
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB
Value | Count
1 | 36863
4 | 625
3 | 155
2 | 66

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 37709
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 36863 | 97.8%
4 | 625 | 1.7%
3 | 155 | 0.4%
2 | 66 | 0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
1 | 36863 | 97.8%
4 | 625 | 1.7%
3 | 155 | 0.4%
2 | 66 | 0.2%

Most occurring characters

Value | Count | Frequency (%)
1 | 36863 | 97.8%
4 | 625 | 1.7%
3 | 155 | 0.4%
2 | 66 | 0.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 37709 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 36863 | 97.8%
4 | 625 | 1.7%
3 | 155 | 0.4%
2 | 66 | 0.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 37709 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 36863 | 97.8%
4 | 625 | 1.7%
3 | 155 | 0.4%
2 | 66 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 37709 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 36863 | 97.8%
4 | 625 | 1.7%
3 | 155 | 0.4%
2 | 66 | 0.2%

PuntoVenta__c
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 283
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3909.493569
Minimum: 5
Maximum: 20007
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 294.7 KiB

Quantile statistics

Minimum: 5
5-th percentile: 103
Q1: 1048
median: 2468
Q3: 9467
95-th percentile: 9721
Maximum: 20007
Range: 20002
Interquartile range (IQR): 8419

Descriptive statistics

Standard deviation: 3649.840905
Coefficient of variation (CV): 0.9335840667
Kurtosis: -1.068064972
Mean: 3909.493569
Median Absolute Deviation (MAD): 1468
Skewness: 0.7700464119
Sum: 147423093
Variance: 13321338.63
Monotonicity: Not monotonic
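
Several of the figures above are derived from others in the same block. A short sketch of those relationships, using the PuntoVenta__c values as printed (variable names are illustrative):

```python
import math

# Figures copied from the PuntoVenta__c statistics above.
minimum, maximum = 5, 20007
q1, q3 = 1048, 9467
mean = 3909.493569
std = 3649.840905

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 4), round(variance, 2))
print(math.isclose(cv, 0.9335840667, rel_tol=1e-6))
```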
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
9721 | 7161 | 19.0%
3301 | 2106 | 5.6%
1048 | 1758 | 4.7%
3202 | 1697 | 4.5%
103 | 1531 | 4.1%
404 | 1501 | 4.0%
2466 | 1433 | 3.8%
301 | 1261 | 3.3%
3502 | 1250 | 3.3%
1503 | 1015 | 2.7%
Other values (273) | 16996 | 45.1%

Value | Count | Frequency (%)
5 | 175 | 0.5%
8 | 32 | 0.1%
9 | 54 | 0.1%
14 | 140 | 0.4%
16 | 121 | 0.3%
23 | 504 | 1.3%
25 | 126 | 0.3%
26 | 92 | 0.2%
100 | 1 | < 0.1%
102 | 9 | < 0.1%

Value | Count | Frequency (%)
20007 | 2 | < 0.1%
10111 | 2 | < 0.1%
9977 | 8 | < 0.1%
9974 | 2 | < 0.1%
9973 | 1 | < 0.1%
9972 | 5 | < 0.1%
9971 | 75 | 0.2%
9970 | 4 | < 0.1%
9969 | 51 | 0.1%
9967 | 9 | < 0.1%

tipo_ramo_name
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
Value | Count
automoviles | 34397
previhogar | 3216
responsabilidad civil | 96

Length

Max length: 21
Median length: 11
Mean length: 10.94017343
Min length: 10

Characters and Unicode

Total characters: 412543
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: previhogar
2nd row: previhogar
3rd row: previhogar
4th row: previhogar
5th row: previhogar

Common Values

Value | Count | Frequency (%)
automoviles | 34397 | 91.2%
previhogar | 3216 | 8.5%
responsabilidad civil | 96 | 0.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
automoviles | 34397 | 91.0%
previhogar | 3216 | 8.5%
responsabilidad | 96 | 0.3%
civil | 96 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
o | 72106 | 17.5%
i | 37997 | 9.2%
a | 37805 | 9.2%
e | 37709 | 9.1%
v | 37709 | 9.1%
l | 34589 | 8.4%
s | 34589 | 8.4%
m | 34397 | 8.3%
t | 34397 | 8.3%
u | 34397 | 8.3%
Other values (9) | 16848 | 4.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 412447 | > 99.9%
Space Separator | 96 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 72106 | 17.5%
i | 37997 | 9.2%
a | 37805 | 9.2%
e | 37709 | 9.1%
v | 37709 | 9.1%
l | 34589 | 8.4%
s | 34589 | 8.4%
m | 34397 | 8.3%
t | 34397 | 8.3%
u | 34397 | 8.3%
Other values (8) | 16752 | 4.1%
Space Separator
Value | Count | Frequency (%)
(space) | 96 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 412447 | > 99.9%
Common | 96 | < 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 72106 | 17.5%
i | 37997 | 9.2%
a | 37805 | 9.2%
e | 37709 | 9.1%
v | 37709 | 9.1%
l | 34589 | 8.4%
s | 34589 | 8.4%
m | 34397 | 8.3%
t | 34397 | 8.3%
u | 34397 | 8.3%
Other values (8) | 16752 | 4.1%
Common
Value | Count | Frequency (%)
(space) | 96 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 412543 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 72106 | 17.5%
i | 37997 | 9.2%
a | 37805 | 9.2%
e | 37709 | 9.1%
v | 37709 | 9.1%
l | 34589 | 8.4%
s | 34589 | 8.4%
m | 34397 | 8.3%
t | 34397 | 8.3%
u | 34397 | 8.3%
Other values (9) | 16848 | 4.1%

tipo_prod_desc
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
Value | Count
automoviles | 34397
previhogar | 3216
profesionales medicos | 56
directores y administradores | 38
servidores publicos | 2

Length

Max length: 28
Median length: 11
Mean length: 10.94712138
Min length: 10

Characters and Unicode

Total characters: 412805
Distinct characters: 21
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: previhogar
2nd row: previhogar
3rd row: previhogar
4th row: previhogar
5th row: previhogar

Common Values

Value | Count | Frequency (%)
automoviles | 34397 | 91.2%
previhogar | 3216 | 8.5%
profesionales medicos | 56 | 0.1%
directores y administradores | 38 | 0.1%
servidores publicos | 2 | < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
automoviles | 34397 | 90.9%
previhogar | 3216 | 8.5%
profesionales | 56 | 0.1%
medicos | 56 | 0.1%
directores | 38 | 0.1%
y | 38 | 0.1%
administradores | 38 | 0.1%
servidores | 2 | < 0.1%
publicos | 2 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
o | 72258 | 17.5%
e | 37899 | 9.2%
i | 37843 | 9.2%
a | 37745 | 9.1%
v | 37615 | 9.1%
s | 34685 | 8.4%
m | 34491 | 8.4%
t | 34473 | 8.4%
l | 34455 | 8.3%
u | 34399 | 8.3%
Other values (11) | 16942 | 4.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 412671 | > 99.9%
Space Separator | 134 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 72258 | 17.5%
e | 37899 | 9.2%
i | 37843 | 9.2%
a | 37745 | 9.1%
v | 37615 | 9.1%
s | 34685 | 8.4%
m | 34491 | 8.4%
t | 34473 | 8.4%
l | 34455 | 8.3%
u | 34399 | 8.3%
Other values (10) | 16808 | 4.1%
Space Separator
Value | Count | Frequency (%)
(space) | 134 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 412671 | > 99.9%
Common | 134 | < 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 72258 | 17.5%
e | 37899 | 9.2%
i | 37843 | 9.2%
a | 37745 | 9.1%
v | 37615 | 9.1%
s | 34685 | 8.4%
m | 34491 | 8.4%
t | 34473 | 8.4%
l | 34455 | 8.3%
u | 34399 | 8.3%
Other values (10) | 16808 | 4.1%
Common
Value | Count | Frequency (%)
(space) | 134 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 412805 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 72258 | 17.5%
e | 37899 | 9.2%
i | 37843 | 9.2%
a | 37745 | 9.1%
v | 37615 | 9.1%
s | 34685 | 8.4%
m | 34491 | 8.4%
t | 34473 | 8.4%
l | 34455 | 8.3%
u | 34399 | 8.3%
Other values (11) | 16942 | 4.1%

ClaseVehiculo__c
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB
Value | Count
99999 | 37709

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 188545
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 99999
2nd row: 99999
3rd row: 99999
4th row: 99999
5th row: 99999

Common Values

Value | Count | Frequency (%)
99999 | 37709 | 100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
99999 | 37709 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
9 | 188545 | 100.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 188545 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
9 | 188545 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 188545 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
9 | 188545 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 188545 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
9 | 188545 | 100.0%

MarcaVehiculo__c
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 37709
Missing (%): 100.0%
Memory size: 294.7 KiB

MdeloVehiculo__c
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 37709
Missing (%): 100.0%
Memory size: 294.7 KiB

TipoVehiculo__c
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB
Value | Count
99999 | 37709

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 188545
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 99999
2nd row: 99999
3rd row: 99999
4th row: 99999
5th row: 99999

Common Values

Value | Count | Frequency (%)
99999 | 37709 | 100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
99999 | 37709 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
9 | 188545 | 100.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 188545 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
9 | 188545 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 188545 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
9 | 188545 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 188545 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
9 | 188545 | 100.0%

NumeroPoliza__c
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 29609
Distinct (%): 78.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2863543.321
Minimum: 1000001
Maximum: 3173059
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 294.7 KiB

Quantile statistics

Minimum: 1000001
5-th percentile: 1004451.8
Q1: 3006964
median: 3029704
Q3: 3075038
95-th percentile: 3129598.6
Maximum: 3173059
Range: 2173058
Interquartile range (IQR): 68074

Descriptive statistics

Standard deviation: 584922.3783
Coefficient of variation (CV): 0.2042652451
Kurtosis: 6.165415324
Mean: 2863543.321
Median Absolute Deviation (MAD): 25132
Skewness: -2.847314158
Sum: 1.079813551 × 10^11
Variance: 3.421341887 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1000408 | 7 | < 0.1%
3007886 | 6 | < 0.1%
3007900 | 6 | < 0.1%
3007953 | 6 | < 0.1%
1003679 | 6 | < 0.1%
3007934 | 6 | < 0.1%
3007923 | 6 | < 0.1%
3004772 | 6 | < 0.1%
3004670 | 6 | < 0.1%
3007894 | 6 | < 0.1%
Other values (29599) | 37648 | 99.8%

Value | Count | Frequency (%)
1000001 | 2 | < 0.1%
1000002 | 2 | < 0.1%
1000003 | 1 | < 0.1%
1000004 | 1 | < 0.1%
1000005 | 2 | < 0.1%
1000006 | 1 | < 0.1%
1000007 | 1 | < 0.1%
1000022 | 2 | < 0.1%
1000023 | 1 | < 0.1%
1000024 | 1 | < 0.1%

Value | Count | Frequency (%)
3173059 | 1 | < 0.1%
3173056 | 1 | < 0.1%
3172855 | 1 | < 0.1%
3172830 | 1 | < 0.1%
3172803 | 1 | < 0.1%
3171951 | 1 | < 0.1%
3171790 | 1 | < 0.1%
3170723 | 1 | < 0.1%
3169512 | 1 | < 0.1%
3169510 | 1 | < 0.1%

FechaInicioVigencia__ctrim
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB
Value | Count
03-2018 | 6043
02-2018 | 5772
02-2019 | 5340
01-2019 | 5164
03-2020 | 4583
Other values (6) | 10807

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 263963
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
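
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point as a two-letter code (`Nd` for Decimal Number, `Pd` for Dash Punctuation). The sketch below counts categories over a few values of this column; it is not the profiler's own implementation, and `category_counts` is a name introduced here for the example.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three values taken from the column's sample and common values.
sample = ["01-2018", "02-2019", "03-2020"]
counts = category_counts(sample)
print(counts)
```

Each value contributes six digits and one dash, which matches the two distinct categories reported for this variable.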

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 01-2018
2nd row: 01-2018
3rd row: 01-2018
4th row: 01-2018
5th row: 01-2018

Common Values

Value | Count | Frequency (%)
03-2018 | 6043 | 16.0%
02-2018 | 5772 | 15.3%
02-2019 | 5340 | 14.2%
01-2019 | 5164 | 13.7%
03-2020 | 4583 | 12.2%
01-2018 | 4464 | 11.8%
02-2021 | 4225 | 11.2%
01-2021 | 2083 | 5.5%
03-2019 | 26 | 0.1%
02-2020 | 7 | < 0.1%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
03-2018 | 6043 | 16.0%
02-2018 | 5772 | 15.3%
02-2019 | 5340 | 14.2%
01-2019 | 5164 | 13.7%
03-2020 | 4583 | 12.2%
01-2018 | 4464 | 11.8%
02-2021 | 4225 | 11.2%
01-2021 | 2083 | 5.5%
03-2019 | 26 | 0.1%
02-2020 | 7 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 80010 | 30.3%
2 | 63953 | 24.2%
1 | 44830 | 17.0%
- | 37709 | 14.3%
8 | 16279 | 6.2%
3 | 10652 | 4.0%
9 | 10530 | 4.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 226254 | 85.7%
Dash Punctuation | 37709 | 14.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 80010 | 35.4%
2 | 63953 | 28.3%
1 | 44830 | 19.8%
8 | 16279 | 7.2%
3 | 10652 | 4.7%
9 | 10530 | 4.7%
Dash Punctuation
Value | Count | Frequency (%)
- | 37709 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 263963 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 80010 | 30.3%
2 | 63953 | 24.2%
1 | 44830 | 17.0%
- | 37709 | 14.3%
8 | 16279 | 6.2%
3 | 10652 | 4.0%
9 | 10530 | 4.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 263963 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 80010 | 30.3%
2 | 63953 | 24.2%
1 | 44830 | 17.0%
- | 37709 | 14.3%
8 | 16279 | 6.2%
3 | 10652 | 4.0%
9 | 10530 | 4.0%

churn
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB
Value | Count
1 | 26130
0 | 11579

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 37709
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 26130 | 69.3%
0 | 11579 | 30.7%
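
If churn is used as a classification target, the class shares above set the accuracy a trivial majority-class predictor would reach. A minimal sketch of that calculation follows; the report does not state which label denotes a churned policy, so the labels are treated as opaque strings.

```python
# Counts copied from the Common Values table above.
counts = {"1": 26130, "0": 11579}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# A classifier that always predicts the most frequent label reaches this accuracy.
majority_label = max(counts, key=counts.get)
baseline_accuracy = shares[majority_label]

print(f"majority label {majority_label}: {100 * baseline_accuracy:.1f}%")
```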

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
1 | 26130 | 69.3%
0 | 11579 | 30.7%

Most occurring characters

Value | Count | Frequency (%)
1 | 26130 | 69.3%
0 | 11579 | 30.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 37709 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 26130 | 69.3%
0 | 11579 | 30.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 37709 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 26130 | 69.3%
0 | 11579 | 30.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 37709 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 26130 | 69.3%
0 | 11579 | 30.7%

n_prod_prev
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 11
Distinct (%): < 0.1%
Missing: 1218
Missing (%): 3.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.041434874
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 294.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 5
95-th percentile: 10
Maximum: 16
Range: 15
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.174608732
Coefficient of variation (CV): 1.043786523
Kurtosis: 7.148963851
Mean: 3.041434874
Median Absolute Deviation (MAD): 1
Skewness: 2.60820907
Sum: 110985
Variance: 10.0781406
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value | Count | Frequency (%)
1 | 14855 | 39.4%
2 | 8460 | 22.4%
5 | 7592 | 20.1%
3 | 2473 | 6.6%
14 | 1035 | 2.7%
4 | 896 | 2.4%
16 | 772 | 2.0%
10 | 207 | 0.5%
6 | 128 | 0.3%
8 | 56 | 0.1%
(Missing) | 1218 | 3.2%

Value | Count | Frequency (%)
1 | 14855 | 39.4%
2 | 8460 | 22.4%
3 | 2473 | 6.6%
4 | 896 | 2.4%
5 | 7592 | 20.1%
6 | 128 | 0.3%
7 | 17 | < 0.1%
8 | 56 | 0.1%
10 | 207 | 0.5%
14 | 1035 | 2.7%

Value | Count | Frequency (%)
16 | 772 | 2.0%
14 | 1035 | 2.7%
10 | 207 | 0.5%
8 | 56 | 0.1%
7 | 17 | < 0.1%
6 | 128 | 0.3%
5 | 7592 | 20.1%
4 | 896 | 2.4%
3 | 2473 | 6.6%
2 | 8460 | 22.4%

total_siniestros
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 74
Distinct (%): 0.5%
Missing: 21451
Missing (%): 56.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1594.383688
Minimum: 1
Maximum: 3466
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 294.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 704
Q3: 3466
95-th percentile: 3466
Maximum: 3466
Range: 3465
Interquartile range (IQR): 3464

Descriptive statistics

Standard deviation: 1681.585853
Coefficient of variation (CV): 1.054693337
Kurtosis: -1.933381206
Mean: 1594.383688
Median Absolute Deviation (MAD): 703
Skewness: 0.1978923917
Sum: 25921490
Variance: 2827730.981
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
3466 | 7219 | 19.1%
1 | 4059 | 10.8%
2 | 1058 | 2.8%
704 | 1035 | 2.7%
136 | 772 | 2.0%
3 | 455 | 1.2%
4 | 288 | 0.8%
5 | 262 | 0.7%
92 | 207 | 0.5%
7 | 168 | 0.4%
Other values (64) | 735 | 1.9%
(Missing) | 21451 | 56.9%

Value | Count | Frequency (%)
1 | 4059 | 10.8%
2 | 1058 | 2.8%
3 | 455 | 1.2%
4 | 288 | 0.8%
5 | 262 | 0.7%
6 | 85 | 0.2%
7 | 168 | 0.4%
8 | 28 | 0.1%
9 | 74 | 0.2%
10 | 64 | 0.2%

Value | Count | Frequency (%)
3466 | 7219 | 19.1%
2972 | 4 | < 0.1%
2698 | 1 | < 0.1%
1311 | 1 | < 0.1%
770 | 1 | < 0.1%
704 | 1035 | 2.7%
543 | 5 | < 0.1%
215 | 1 | < 0.1%
179 | 1 | < 0.1%
145 | 3 | < 0.1%

total_pagado_smmlv
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 2424
Distinct (%): 14.9%
Missing: 21451
Missing (%): 56.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 11536.93594
Minimum: 0
Maximum: 64793.74852
Zeros: 1856
Zeros (%): 4.9%
Negative: 0
Negative (%): 0.0%
Memory size: 294.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 9.045556586
median: 5779.809377
Q3: 24921.89139
95-th percentile: 24921.89139
Maximum: 64793.74852
Range: 64793.74852
Interquartile range (IQR): 24912.84583

Descriptive statistics

Standard deviation: 12080.92058
Coefficient of variation (CV): 1.04715157
Kurtosis: -1.847329363
Mean: 11536.93594
Median Absolute Deviation (MAD): 5779.809377
Skewness: 0.2074928349
Sum: 187567504.6
Variance: 145948642.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
24921.89139 | 7219 | 19.1%
0 | 1856 | 4.9%
5779.809377 | 1035 | 2.7%
1308.359896 | 772 | 2.0%
506.8921899 | 207 | 0.5%
41.55167503 | 87 | 0.2%
133.4554683 | 71 | 0.2%
205.1638361 | 54 | 0.1%
794.1729883 | 32 | 0.1%
289.5339288 | 29 | 0.1%
Other values (2414) | 4896 | 13.0%
(Missing) | 21451 | 56.9%
ValueCountFrequency (%)
01856
4.9%
0.011439345982
 
< 0.1%
0.023414137341
 
< 0.1%
0.058392310336
 
< 0.1%
0.06497090531
 
< 0.1%
0.066517251211
 
< 0.1%
0.0686159912
 
< 0.1%
0.074825444894
 
< 0.1%
0.08072274669
 
< 0.1%
0.082080840511
 
< 0.1%
ValueCountFrequency (%)
64793.748524
 
< 0.1%
24921.891397219
19.1%
5779.8093771035
 
2.7%
3629.204432
 
< 0.1%
2962.7489645
 
< 0.1%
2501.8172042
 
< 0.1%
1308.359896772
 
2.0%
1295.6617331
 
< 0.1%
1194.0444192
 
< 0.1%
1124.7842813
 
< 0.1%

anios_ultimo_siniestro
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 1151
Distinct (%): 7.1%
Missing: 21451
Missing (%): 56.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6779281686
Minimum: 0.002739726027
Maximum: 10.69863014
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 294.7 KiB

Quantile statistics

Minimum: 0.002739726027
5-th percentile: 0.002739726027
Q1: 0.002739726027
median: 0.03287671233
Q3: 1.290410959
95-th percentile: 2.783561644
Maximum: 10.69863014
Range: 10.69589041
Interquartile range (IQR): 1.287671233

Descriptive statistics

Standard deviation: 1.046954504
Coefficient of variation (CV): 1.544344302
Kurtosis: 5.311930213
Mean: 0.6779281686
Median Absolute Deviation (MAD): 0.0301369863
Skewness: 1.820966177
Sum: 11021.75616
Variance: 1.096113734
Monotonicity: Not monotonic
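
The minimum of 0.002739726027 equals 1/365 to the printed precision, which suggests the column stores a number of days divided by 365. That reading is an inference from the values, not something the report states. Under that assumption, the quantiles convert back to whole days:

```python
DAYS_PER_YEAR = 365  # assumption inferred from the values, not stated in the report


def years_to_days(years):
    """Convert a fractional-year value back to a whole number of days."""
    return round(years * DAYS_PER_YEAR)


# Figures copied from the quantile statistics above.
minimum_years = 0.002739726027
median_years = 0.03287671233
maximum_years = 10.69863014

print(years_to_days(minimum_years), years_to_days(median_years), years_to_days(maximum_years))
```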
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.002739726027 | 7223 | 19.2%
0.0602739726 | 1044 | 2.8%
0.008219178082 | 803 | 2.1%
0.04383561644 | 209 | 0.6%
0.08767123288 | 92 | 0.2%
0.4328767123 | 90 | 0.2%
0.1534246575 | 67 | 0.2%
0.07671232877 | 47 | 0.1%
1.449315068 | 35 | 0.1%
1.684931507 | 32 | 0.1%
Other values (1141) | 6616 | 17.5%
(Missing) | 21451 | 56.9%

Value | Count | Frequency (%)
0.002739726027 | 7223 | 19.2%
0.005479452055 | 13 | < 0.1%
0.008219178082 | 803 | 2.1%
0.01095890411 | 15 | < 0.1%
0.01369863014 | 23 | 0.1%
0.01643835616 | 11 | < 0.1%
0.01917808219 | 4 | < 0.1%
0.02465753425 | 9 | < 0.1%
0.02739726027 | 2 | < 0.1%
0.0301369863 | 25 | 0.1%

Value | Count | Frequency (%)
10.69863014 | 2 | < 0.1%
10.00273973 | 6 | < 0.1%
9.095890411 | 1 | < 0.1%
7.923287671 | 1 | < 0.1%
7.509589041 | 2 | < 0.1%
7.282191781 | 1 | < 0.1%
7.082191781 | 2 | < 0.1%
6.97260274 | 1 | < 0.1%
6.81369863 | 1 | < 0.1%
6.668493151 | 2 | < 0.1%

Activos__c
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 3955
Distinct (%): 17.1%
Missing: 14609
Missing (%): 38.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 437803920.1
Minimum: 0
Maximum: 1 × 10^12
Zeros: 24
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 294.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 22000000
Q1: 62900000
median: 126000000
Q3: 280000000
95-th percentile: 1040376850
Maximum: 1 × 10^12
Range: 1 × 10^12
Interquartile range (IQR): 217100000

Descriptive statistics

Standard deviation: 9434661146
Coefficient of variation (CV): 21.54996955
Kurtosis: 10909.77113
Mean: 437803920.1
Median Absolute Deviation (MAD): 76000000
Skewness: 103.1264111
Sum: 1.011327055 × 10^13
Variance: 8.901283094 × 10^19
Monotonicity: Not monotonic
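
A skewness this large means a handful of extreme values dominate the distribution. The sketch below shows one common estimator, the bias-adjusted Fisher-Pearson coefficient, on a toy sample. It assumes this is the estimator behind the reported figure, which the report itself does not state, and `sample_skewness` is a name introduced here.

```python
import math


def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness: sqrt(n(n-1))/(n-2) * m3 / m2**1.5."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three observations")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]          # no skew expected
one_outlier = [1, 1, 1, 1, 100]      # a single large value pulls the tail right

print(sample_skewness(symmetric), sample_skewness(one_outlier))
```

For heavy right tails like this one, a log transform or rank-based statistics are common ways to keep a few large values from dominating.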
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100000000 | 1139 | 3.0%
80000000 | 859 | 2.3%
200000000 | 805 | 2.1%
150000000 | 801 | 2.1%
50000000 | 687 | 1.8%
120000000 | 633 | 1.7%
60000000 | 592 | 1.6%
40000000 | 494 | 1.3%
90000000 | 470 | 1.2%
30000000 | 464 | 1.2%
Other values (3945) | 16156 | 42.8%
(Missing) | 14609 | 38.7%

Value | Count | Frequency (%)
0 | 24 | 0.1%
1 | 76 | 0.2%
2 | 3 | < 0.1%
20 | 2 | < 0.1%
40 | 3 | < 0.1%
50 | 1 | < 0.1%
80 | 1 | < 0.1%
100 | 3 | < 0.1%
104 | 1 | < 0.1%
350 | 2 | < 0.1%

Value | Count | Frequency (%)
1 × 10^12 | 2 | < 0.1%
1 × 10^11 | 2 | < 0.1%
5.835 × 10^10 | 1 | < 0.1%
5.43154931 × 10^10 | 1 | < 0.1%
4.479 × 10^10 | 1 | < 0.1%
4.0078653 × 10^10 | 4 | < 0.1%
3.1092794 × 10^10 | 3 | < 0.1%
2.8125029 × 10^10 | 2 | < 0.1%
2.43 × 10^10 | 1 | < 0.1%
2.0615603 × 10^10 | 1 | < 0.1%

AnnualRevenue
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
SKEWED

Distinct: 3904
Distinct (%): 16.9%
Missing: 14609
Missing (%): 38.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 235281369.7
Minimum: 0
Maximum: 8.63 × 10^10
Zeros: 8
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 294.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 8500000
Q1: 26000000
median: 45000000
Q3: 84000000
95-th percentile: 538284000
Maximum: 8.63 × 10^10
Range: 8.63 × 10^10
Interquartile range (IQR): 58000000

Descriptive statistics

Standard deviation: 1615782969
Coefficient of variation (CV): 6.86744969
Kurtosis: 790.166724
Mean: 235281369.7
Median Absolute Deviation (MAD): 23000000
Skewness: 23.01222549
Sum: 5.434999639 × 10^12
Variance: 2.610754603 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
36000000 | 1175 | 3.1%
60000000 | 921 | 2.4%
24000000 | 898 | 2.4%
30000000 | 834 | 2.2%
48000000 | 708 | 1.9%
18000000 | 490 | 1.3%
50000000 | 434 | 1.2%
12000000 | 408 | 1.1%
40000000 | 407 | 1.1%
42000000 | 342 | 0.9%
Other values (3894) | 16483 | 43.7%
(Missing) | 14609 | 38.7%

Value | Count | Frequency (%)
0 | 8 | < 0.1%
1 | 28 | 0.1%
20 | 2 | < 0.1%
52230 | 1 | < 0.1%
250000 | 1 | < 0.1%
328000 | 1 | < 0.1%
350000 | 1 | < 0.1%
500000 | 3 | < 0.1%
600000 | 1 | < 0.1%
700000 | 3 | < 0.1%

Value | Count | Frequency (%)
8.63 × 10^10 | 1 | < 0.1%
6.6728 × 10^10 | 1 | < 0.1%
6 × 10^10 | 2 | < 0.1%
4.1610143 × 10^10 | 2 | < 0.1%
3.6110425 × 10^10 | 5 | < 0.1%
3.5084 × 10^10 | 2 | < 0.1%
2.6097952 × 10^10 | 2 | < 0.1%
2.576590619 × 10^10 | 2 | < 0.1%
2.469016564 × 10^10 | 3 | < 0.1%
2.3626255 × 10^10 | 2 | < 0.1%

MontoAnual__c
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 12
Distinct (%): 44.4%
Missing: 37682
Missing (%): 99.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1857979.296
Minimum: 0
Maximum: 50000000
Zeros: 10
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 294.7 KiB
2022-06-08T14:33:29.545690image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 600
Q3: 18000
95-th percentile: 23981.5
Maximum: 50000000
Range: 50000000
Interquartile range (IQR): 18000

Descriptive statistics

Standard deviation: 9621283.821
Coefficient of variation (CV): 5.178359006
Kurtosis: 26.99995075
Mean: 1857979.296
Median Absolute Deviation (MAD): 600
Skewness: 5.196145569
Sum: 50165441
Variance: 9.256910237 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value                 Count   Frequency (%)
0                     10      < 0.1%
18000                 4       < 0.1%
3000                  3       < 0.1%
21605                 2       < 0.1%
10000                 1       < 0.1%
5525                  1       < 0.1%
600                   1       < 0.1%
50000000              1       < 0.1%
25000                 1       < 0.1%
5                     1       < 0.1%
Other values (2)      2       < 0.1%
(Missing)             37682   99.9%

Value                 Count   Frequency (%)
0                     10      < 0.1%
1                     1       < 0.1%
5                     1       < 0.1%
100                   1       < 0.1%
600                   1       < 0.1%
3000                  3       < 0.1%
5525                  1       < 0.1%
10000                 1       < 0.1%
18000                 4       < 0.1%
21605                 2       < 0.1%

Value                 Count   Frequency (%)
50000000              1       < 0.1%
25000                 1       < 0.1%
21605                 2       < 0.1%
18000                 4       < 0.1%
10000                 1       < 0.1%
5525                  1       < 0.1%
3000                  3       < 0.1%
600                   1       < 0.1%
100                   1       < 0.1%
5                     1       < 0.1%

OtrosIngresos__c
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct: 292
Distinct (%): 1.4%
Missing: 16905
Missing (%): 44.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 2793906.203
Minimum: 0
Maximum: 8400000000
Zeros: 19756
Zeros (%): 52.4%
Negative: 0
Negative (%): 0.0%
Memory size: 294.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 200000
Maximum: 8400000000
Range: 8400000000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 74595168.14
Coefficient of variation (CV): 26.69923853
Kurtosis: 8216.563948
Mean: 2793906.203
Median Absolute Deviation (MAD): 0
Skewness: 81.18737292
Sum: 5.812442464 × 10^10
Variance: 5.56443911 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0                     19756   52.4%
12000000              58      0.2%
10000000              40      0.1%
6000000               35      0.1%
30000000              32      0.1%
24000000              31      0.1%
5000000               27      0.1%
18000000              22      0.1%
20000000              19      0.1%
15000000              18      < 0.1%
Other values (282)    766     2.0%
(Missing)             16905   44.8%

Value                 Count   Frequency (%)
0                     19756   52.4%
9583                  1       < 0.1%
24000                 2       < 0.1%
178000                2       < 0.1%
183000                1       < 0.1%
200000                6       < 0.1%
201105                2       < 0.1%
228000                1       < 0.1%
239000                1       < 0.1%
240000                2       < 0.1%

Value                 Count   Frequency (%)
8400000000            1       < 0.1%
3439000000            2       < 0.1%
1664801000            5       < 0.1%
936054000             1       < 0.1%
928737000             2       < 0.1%
862423500             2       < 0.1%
590960000             3       < 0.1%
436838000             1       < 0.1%
398188968             1       < 0.1%
268346000             3       < 0.1%

Profesion__pc
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 37709
Missing (%): 100.0%
Memory size: 294.7 KiB

EgresosAnuales__c
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 2996
Distinct (%): 13.0%
Missing: 14609
Missing (%): 38.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 153383472.9
Minimum: 0
Maximum: 3.6967344 × 10^10
Zeros: 15
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 294.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3000000
Q1: 12000000
Median: 24000000
Q3: 50000000
95-th percentile: 401471000
Maximum: 3.6967344 × 10^10
Range: 3.6967344 × 10^10
Interquartile range (IQR): 38000000

Descriptive statistics

Standard deviation: 1013404716
Coefficient of variation (CV): 6.607000724
Kurtosis: 471.1053889
Mean: 153383472.9
Median Absolute Deviation (MAD): 14000000
Skewness: 18.8050108
Sum: 3.543158223 × 10^12
Variance: 1.026989119 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
12000000              1200    3.2%
30000000              1011    2.7%
24000000              949     2.5%
18000000              903     2.4%
20000000              884     2.3%
36000000              544     1.4%
10000000              519     1.4%
15000000              517     1.4%
40000000              511     1.4%
25000000              445     1.2%
Other values (2986)   15617   41.4%
(Missing)             14609   38.7%

Value                 Count   Frequency (%)
0                     15      < 0.1%
1                     153     0.4%
10                    8       < 0.1%
18                    1       < 0.1%
100                   1       < 0.1%
204                   2       < 0.1%
20000                 1       < 0.1%
50000                 2       < 0.1%
70000                 1       < 0.1%
72000                 3       < 0.1%

Value                 Count   Frequency (%)
3.6967344 × 10^10     2       < 0.1%
3.0868341 × 10^10     5       < 0.1%
2.2582482 × 10^10     2       < 0.1%
2.166849912 × 10^10   4       < 0.1%
2.1322738 × 10^10     2       < 0.1%
1.9746108 × 10^10     2       < 0.1%
1.923885007 × 10^10   3       < 0.1%
1.8035021 × 10^10     1       < 0.1%
1.629516502 × 10^10   2       < 0.1%
1.4868 × 10^10        2       < 0.1%

EstadoCivil__pc
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 8
Distinct (%): < 0.1%
Missing: 14347
Missing (%): 38.0%
Memory size: 1.8 MiB

SOLTERO               10974
CASADO                10226
OTRO                  1640
UNIDO                 307
VIUDO                 81
Other values (3)      134

Length

Max length: 10
Median length: 8
Mean length: 6.324929372
Min length: 3

Characters and Unicode

Total characters: 147763
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SOLTERO
2nd row: CASADO
3rd row: SOLTERO
4th row: CASADO
5th row: SOLTERO

Common Values

Value                 Count   Frequency (%)
SOLTERO               10974   29.1%
CASADO                10226   27.1%
OTRO                  1640    4.3%
UNIDO                 307     0.8%
VIUDO                 81      0.2%
SEPARADO              73      0.2%
DIVORCIADO            46      0.1%
N A                   15      < 0.1%
(Missing)             14347   38.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value                 Count   Frequency (%)
soltero               10974   46.9%
casado                10226   43.7%
otro                  1640    7.0%
unido                 307     1.3%
viudo                 81      0.3%
separado              73      0.3%
divorciado            46      0.2%
n                     15      0.1%
a                     15      0.1%

Most occurring characters

Value                 Count   Frequency (%)
O                     36007   24.4%
S                     21273   14.4%
A                     20659   14.0%
R                     12733   8.6%
T                     12614   8.5%
E                     11047   7.5%
L                     10974   7.4%
D                     10779   7.3%
C                     10272   7.0%
I                     480     0.3%
Other values (5)      925     0.6%

Most occurring categories

Value                 Count    Frequency (%)
Uppercase Letter      147748   > 99.9%
Space Separator       15       < 0.1%

Most frequent character per category

Uppercase Letter
Value                 Count   Frequency (%)
O                     36007   24.4%
S                     21273   14.4%
A                     20659   14.0%
R                     12733   8.6%
T                     12614   8.5%
E                     11047   7.5%
L                     10974   7.4%
D                     10779   7.3%
C                     10272   7.0%
I                     480     0.3%
Other values (4)      910     0.6%
Space Separator
Value                 Count   Frequency (%)
(space)               15      100.0%

Most occurring scripts

Value                 Count    Frequency (%)
Latin                 147748   > 99.9%
Common                15       < 0.1%

Most frequent character per script

Latin
Value                 Count   Frequency (%)
O                     36007   24.4%
S                     21273   14.4%
A                     20659   14.0%
R                     12733   8.6%
T                     12614   8.5%
E                     11047   7.5%
L                     10974   7.4%
D                     10779   7.3%
C                     10272   7.0%
I                     480     0.3%
Other values (4)      910     0.6%
Common
Value                 Count   Frequency (%)
(space)               15      100.0%

Most occurring blocks

Value                 Count    Frequency (%)
ASCII                 147763   100.0%

Most frequent character per block

ASCII
Value                 Count   Frequency (%)
O                     36007   24.4%
S                     21273   14.4%
A                     20659   14.0%
R                     12733   8.6%
T                     12614   8.5%
E                     11047   7.5%
L                     10974   7.4%
D                     10779   7.3%
C                     10272   7.0%
I                     480     0.3%
Other values (5)      925     0.6%

Genero__pc
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 14347
Missing (%): 38.0%
Memory size: 1.9 MiB

MASCULINO             17166
FEMENINO              6188
N A                   8

Length

Max length: 9
Median length: 9
Mean length: 8.733070799
Min length: 3

Characters and Unicode

Total characters: 204022
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MASCULINO
2nd row: FEMENINO
3rd row: MASCULINO
4th row: MASCULINO
5th row: FEMENINO

Common Values

Value                 Count   Frequency (%)
MASCULINO             17166   45.5%
FEMENINO              6188    16.4%
N A                   8       < 0.1%
(Missing)             14347   38.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value                 Count   Frequency (%)
masculino             17166   73.5%
femenino              6188    26.5%
n                     8       < 0.1%
a                     8       < 0.1%

Most occurring characters

Value                 Count   Frequency (%)
N                     29550   14.5%
M                     23354   11.4%
I                     23354   11.4%
O                     23354   11.4%
A                     17174   8.4%
S                     17166   8.4%
C                     17166   8.4%
U                     17166   8.4%
L                     17166   8.4%
E                     12376   6.1%
Other values (2)      6196    3.0%

Most occurring categories

Value                 Count    Frequency (%)
Uppercase Letter      204014   > 99.9%
Space Separator       8        < 0.1%

Most frequent character per category

Uppercase Letter
Value                 Count   Frequency (%)
N                     29550   14.5%
M                     23354   11.4%
I                     23354   11.4%
O                     23354   11.4%
A                     17174   8.4%
S                     17166   8.4%
C                     17166   8.4%
U                     17166   8.4%
L                     17166   8.4%
E                     12376   6.1%
Space Separator
Value                 Count   Frequency (%)
(space)               8       100.0%

Most occurring scripts

Value                 Count    Frequency (%)
Latin                 204014   > 99.9%
Common                8        < 0.1%

Most frequent character per script

Latin
Value                 Count   Frequency (%)
N                     29550   14.5%
M                     23354   11.4%
I                     23354   11.4%
O                     23354   11.4%
A                     17174   8.4%
S                     17166   8.4%
C                     17166   8.4%
U                     17166   8.4%
L                     17166   8.4%
E                     12376   6.1%
Common
Value                 Count   Frequency (%)
(space)               8       100.0%

Most occurring blocks

Value                 Count    Frequency (%)
ASCII                 204022   100.0%

Most frequent character per block

ASCII
Value                 Count   Frequency (%)
N                     29550   14.5%
M                     23354   11.4%
I                     23354   11.4%
O                     23354   11.4%
A                     17174   8.4%
S                     17166   8.4%
C                     17166   8.4%
U                     17166   8.4%
L                     17166   8.4%
E                     12376   6.1%
Other values (2)      6196    3.0%

ciudad_name
Categorical

HIGH CORRELATION
MISSING

Distinct: 22
Distinct (%): 0.1%
Missing: 14347
Missing (%): 38.0%
Memory size: 1.9 MiB

otras                 18170
BOGOTÁ D.C.           1726
MEDELLIN              738
CALI                  688
CARTAGENA             195
Other values (17)     1845

Length

Max length: 13
Median length: 5
Mean length: 5.792825957
Min length: 4

Characters and Unicode

Total characters: 135332
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: otras
2nd row: otras
3rd row: otras
4th row: PASTO
5th row: otras

Common Values

Value                 Count   Frequency (%)
otras                 18170   48.2%
BOGOTÁ D.C.           1726    4.6%
MEDELLIN              738     2.0%
CALI                  688     1.8%
CARTAGENA             195     0.5%
ARMENIA               192     0.5%
VILLAVICENCIO         189     0.5%
BUCARAMANGA           184     0.5%
PASTO                 182     0.5%
YOPAL                 173     0.5%
Other values (12)     925     2.5%
(Missing)             14347   38.0%

Length

Histogram of lengths of the category

Value                 Count   Frequency (%)
otras                 18170   72.0%
bogotá                1726    6.8%
d.c                   1726    6.8%
medellin              738     2.9%
cali                  688     2.7%
cartagena             195     0.8%
armenia               192     0.8%
villavicencio         189     0.7%
bucaramanga           184     0.7%
pasto                 182     0.7%
Other values (17)     1241    4.9%

Most occurring characters

Value                 Count   Frequency (%)
o                     18170   13.4%
r                     18170   13.4%
a                     18170   13.4%
s                     18170   13.4%
t                     18170   13.4%
A                     4333    3.2%
O                     4228    3.1%
C                     3520    2.6%
.                     3452    2.6%
L                     3201    2.4%
Other values (21)     25748   19.0%

Most occurring categories

Value                 Count   Frequency (%)
Lowercase Letter      90850   67.1%
Uppercase Letter      39161   28.9%
Other Punctuation     3452    2.6%
Space Separator       1869    1.4%

Most frequent character per category

Uppercase Letter
Value                 Count   Frequency (%)
A                     4333    11.1%
O                     4228    10.8%
C                     3520    9.0%
L                     3201    8.2%
I                     2872    7.3%
E                     2682    6.8%
D                     2600    6.6%
T                     2283    5.8%
G                     2222    5.7%
B                     2079    5.3%
Other values (14)     9141    23.3%
Lowercase Letter
Value                 Count   Frequency (%)
o                     18170   20.0%
r                     18170   20.0%
a                     18170   20.0%
s                     18170   20.0%
t                     18170   20.0%
Other Punctuation
Value                 Count   Frequency (%)
.                     3452    100.0%
Space Separator
Value                 Count   Frequency (%)
(space)               1869    100.0%

Most occurring scripts

Value                 Count    Frequency (%)
Latin                 130011   96.1%
Common                5321     3.9%

Most frequent character per script

Latin
Value                 Count   Frequency (%)
o                     18170   14.0%
r                     18170   14.0%
a                     18170   14.0%
s                     18170   14.0%
t                     18170   14.0%
A                     4333    3.3%
O                     4228    3.3%
C                     3520    2.7%
L                     3201    2.5%
I                     2872    2.2%
Other values (19)     21007   16.2%
Common
Value                 Count   Frequency (%)
.                     3452    64.9%
(space)               1869    35.1%

Most occurring blocks

Value                 Count    Frequency (%)
ASCII                 133337   98.5%
None                  1995     1.5%

Most frequent character per block

ASCII
Value                 Count   Frequency (%)
o                     18170   13.6%
r                     18170   13.6%
a                     18170   13.6%
s                     18170   13.6%
t                     18170   13.6%
A                     4333    3.2%
O                     4228    3.2%
C                     3520    2.6%
.                     3452    2.6%
L                     3201    2.4%
Other values (18)     23753   17.8%
None
Value                 Count   Frequency (%)
Á                     1810    90.7%
Ú                     125     6.3%
É                     60      3.0%

edad
Real number (ℝ≥0)

MISSING

Distinct: 10412
Distinct (%): 44.7%
Missing: 14420
Missing (%): 38.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.56917858
Minimum: 1.4
Maximum: 122.5150685
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 294.7 KiB

Quantile statistics

Minimum: 1.4
5-th percentile: 29.23013699
Q1: 39.89041096
Median: 49.82191781
Q3: 60.46027397
95-th percentile: 74.17589041
Maximum: 122.5150685
Range: 121.1150685
Interquartile range (IQR): 20.56986301

Descriptive statistics

Standard deviation: 14.2289308
Coefficient of variation (CV): 0.2813755571
Kurtosis: 0.4070367718
Mean: 50.56917858
Median Absolute Deviation (MAD): 10.25479452
Skewness: 0.3681551313
Sum: 1177705.6
Variance: 202.4624716
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
46.69315068           80      0.2%
47.52054795           77      0.2%
42.4630137            46      0.1%
122.5150685           31      0.1%
43.69041096           22      0.1%
41.58630137           21      0.1%
57.71232877           20      0.1%
36.51232877           20      0.1%
51.69041096           19      0.1%
60.55890411           19      0.1%
Other values (10402)  22934   60.8%
(Missing)             14420   38.2%

Value                 Count   Frequency (%)
1.4                   1       < 0.1%
3.378082192           1       < 0.1%
3.410958904           1       < 0.1%
4.235616438           1       < 0.1%
4.260273973           1       < 0.1%
4.339726027           1       < 0.1%
4.353424658           2       < 0.1%
4.4                   1       < 0.1%
4.432876712           2       < 0.1%
4.767123288           1       < 0.1%

Value                 Count   Frequency (%)
122.5150685           31      0.1%
104.9589041           2       < 0.1%
103.0767123           3       < 0.1%
100.0219178           1       < 0.1%
98.50136986           2       < 0.1%
96.36712329           2       < 0.1%
96.29863014           2       < 0.1%
96.10958904           1       < 0.1%
95.52054795           2       < 0.1%
95.43835616           14      < 0.1%

Interactions

[Figure: pairwise interaction plots; images not preserved in this export]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
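
The calculation described above can be sketched in plain Python. This is a minimal illustration on made-up data, not the code the report itself runs; the helper names `average_ranks`, `pearson` and `spearman` are hypothetical. Tied values receive the mean of the ranks they span, which is a common convention and an assumption here.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]
rho = spearman(x, y)
r = pearson(x, y)
print(f"Spearman rho = {rho:.4f}, Pearson r = {r:.4f}")
```

On this toy input ρ reaches its maximum because the ordering of `y` matches the ordering of `x` exactly, while r stays below 1 because the relationship is not a straight line.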

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
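
As a small illustration of the formula and of the invariance property, the sketch below computes r on made-up data, then again after shifting and rescaling both variables. The function name `pearson_r` and the data are hypothetical. The scale factors used are positive; a negative factor would flip the sign of r.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Toy data, not taken from the report.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 1.0, 5.0, 6.0, 12.0]
r_original = pearson_r(x, y)

# Change location and (positive) scale of each variable separately.
x_rescaled = [3.0 * v + 100.0 for v in x]
y_rescaled = [0.5 * v - 7.0 for v in y]
r_rescaled = pearson_r(x_rescaled, y_rescaled)

print(f"r before: {r_original:.6f}")
print(f"r after:  {r_rescaled:.6f}")
```

This is why variables on very different scales, such as monetary amounts and ages, can be compared directly in the correlation matrix.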

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
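
The pair counting can be written out directly. The sketch below follows the definition given here literally (often called tau-a) on made-up data; the function name `kendall_tau_a` is hypothetical. Library implementations may use a tie-adjusted variant such as tau-b, so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move the same way
            concordant += 1
        elif direction < 0:    # the variables move in opposite ways
            discordant += 1
        # direction == 0 means a tie; it counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
tau = kendall_tau_a(x, y)
print(f"tau = {tau}")
```

The double loop over pairs is quadratic in the number of observations, which is acceptable for a sketch but slow on tens of thousands of rows.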

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
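
A minimal sketch of a bias-corrected Cramér's V, assuming the usual form of Bergsma's correction: the chi-squared statistic divided by n is reduced by (k-1)(r-1)/(n-1), and the row and column counts are shrunk in the same spirit. The function names and the contingency tables are hypothetical; the tables are toy data, not counts from this dataset.

```python
from math import sqrt

def chi_squared(table):
    """Pearson chi-squared statistic and sample size of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a table given as a list of rows."""
    chi2, n = chi_squared(table)
    r, k = len(table), len(table[0])
    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, floored at 0.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

independent = [[10, 20], [20, 40]]   # rows are proportional to each other
associated = [[50, 0], [0, 50]]      # each row maps to exactly one column

v_independent = cramers_v_corrected(independent)
v_associated = cramers_v_corrected(associated)
print(f"independent: {v_independent:.4f}, associated: {v_associated:.4f}")
```

The sketch assumes every row and column total is positive; an empty row or column would make an expected count zero and the division fail.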

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. The Phik project provides extensive documentation.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
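
The nullity correlation shown in the heatmap can be sketched as an ordinary Pearson correlation between "is present" indicators. The code below is a minimal illustration on made-up columns; the function name `nullity_correlation` and the column names are hypothetical, and `None` stands in for a missing value. With pandas, `df.isnull().corr()` gives the same coefficients.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) has no
        # variation, so the correlation is undefined.
        return float("nan")
    return cov / sqrt(var_a * var_b)

# Toy columns: income and expenses are missing on the same rows,
# claims is present only where the other two are missing.
income = [7.0e5, None, 2.09e7, None, 1.1e8, 2.9e7]
expenses = [1.0, None, 1.82e7, None, 9.6e7, 2.9e7]
claims = [None, 2.0, None, 1.0, None, None]

together = nullity_correlation(income, expenses)
opposite = nullity_correlation(income, claims)
print(f"income vs expenses: {together:.2f}, income vs claims: {opposite:.2f}")
```

A value near +1 means two variables tend to be present or missing on the same rows, and a value near -1 means one tends to be present exactly when the other is missing.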

Sample

First rows

CodigoTipoAsegurado__cPuntoVenta__ctipo_ramo_nametipo_prod_descClaseVehiculo__cMarcaVehiculo__cMdeloVehiculo__cTipoVehiculo__cNumeroPoliza__cFechaInicioVigencia__ctrimchurnn_prod_prevtotal_siniestrostotal_pagado_smmlvanios_ultimo_siniestroActivos__cAnnualRevenueMontoAnual__cOtrosIngresos__cProfesion__pcEgresosAnuales__cEstadoCivil__pcGenero__pcciudad_nameedad
01404previhogarprevihogar99999NaNNaN99999100365801-201802.0NaNNaNNaN1.320000e+087.000000e+05NaNNaNNaN1.000000e+00SOLTEROMASCULINOotras49.230137
11103previhogarprevihogar99999NaNNaN99999100224501-201802.0NaNNaNNaN3.000000e+072.090000e+07NaNNaNNaN1.820000e+07CASADOFEMENINOotras83.964384
215previhogarprevihogar99999NaNNaN99999100005901-201812.01.00.0000001.7397263.434361e+097.238820e+08NaN0.0NaN5.662310e+08SOLTEROMASCULINOotras37.361644
311402previhogarprevihogar99999NaNNaN99999100059601-201814.01.01.5671252.4000004.260000e+081.123034e+08NaN100000000.0NaN9.600000e+07CASADOMASCULINOPASTO50.134247
417002previhogarprevihogar99999NaNNaN99999100113001-201812.0NaNNaNNaN6.000000e+072.900000e+07NaNNaNNaN2.900000e+07SOLTEROFEMENINOotras37.079452
517007previhogarprevihogar99999NaNNaN99999100113101-201816.0NaNNaNNaN2.068315e+087.445627e+07NaN0.0NaN6.645600e+07OTROMASCULINOBOGOTÁ D.C.38.356164
61404previhogarprevihogar99999NaNNaN99999100343301-201802.0NaNNaNNaN6.000000e+084.900000e+07NaNNaNNaN3.500000e+07OTROFEMENINOotras78.435616
71404previhogarprevihogar99999NaNNaN99999100323601-201812.0NaNNaNNaN1.700000e+083.600000e+07NaNNaNNaN2.400000e+07OTROFEMENINOCALI72.476712
81404previhogarprevihogar99999NaNNaN99999100309001-201801.0NaNNaNNaN1.360000e+081.450000e+07NaN9000000.0NaN2.350000e+07CASADOMASCULINOCALI82.273973
91404previhogarprevihogar99999NaNNaN99999100183701-201802.0NaNNaNNaN2.361596e+092.776884e+09NaNNaNNaN1.853277e+09OTROMASCULINOCALI63.600000

Last rows

CodigoTipoAsegurado__cPuntoVenta__ctipo_ramo_nametipo_prod_descClaseVehiculo__cMarcaVehiculo__cMdeloVehiculo__cTipoVehiculo__cNumeroPoliza__cFechaInicioVigencia__ctrimchurnn_prod_prevtotal_siniestrostotal_pagado_smmlvanios_ultimo_siniestroActivos__cAnnualRevenueMontoAnual__cOtrosIngresos__cProfesion__pcEgresosAnuales__cEstadoCivil__pcGenero__pcciudad_nameedad
3769913301responsabilidad civilprofesionales medicos99999NaNNaN99999100369302-20211NaNNaNNaNNaN324578000.083160000.0NaN0.0NaN1.0OTROMASCULINOotras61.054795
3770013301responsabilidad civildirectores y administradores99999NaNNaN99999102759702-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3770111048responsabilidad civildirectores y administradores99999NaNNaN99999102009902-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3770211048responsabilidad civildirectores y administradores99999NaNNaN99999102355502-20211NaN2.01.4151210.394521NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3770313301responsabilidad civilprofesionales medicos99999NaNNaN99999102740002-20211NaNNaNNaNNaN18000000.060000000.0NaN0.0NaN58000000.0CASADOMASCULINOBOGOTÁ D.C.NaN
3770413301responsabilidad civildirectores y administradores99999NaNNaN99999102740302-2021114.0704.05779.8093770.060274NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3770533202responsabilidad civilprofesionales medicos99999NaNNaN99999106011302-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3770618001responsabilidad civilprofesionales medicos99999NaNNaN99999102725502-20211NaNNaNNaNNaN500000000.090000000.0NaN0.0NaN65000000.0SOLTEROMASCULINOBOGOTÁ D.C.122.515068
3770718001responsabilidad civildirectores y administradores99999NaNNaN99999102733102-20210NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3770818001responsabilidad civildirectores y administradores99999NaNNaN99999102742802-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN